(CXR) by applying deep learning-based techniques. The CXR images are
classified into three types: (i) normal, (ii) COVID-19, and (iii) pneumonia.
The classification is challenging because the differences between the X-ray
images of COVID-19 and pneumonia are subtle. The CXR images are first
standardized and their visual contrast is improved. Then, the classification
is performed by a deep learning-based technique that combines two network
architectures, a convolutional neural network (CNN) and long short-term
memory (LSTM), into a hybrid model for the classification problem. The deep
features of the images are extracted by the CNN before the final
classification is performed by the LSTM. In addition to the hybrid models,
this work explores the validity of image pre-processing methods that improve
the quality of the images before the classification is performed. The
experiments were conducted on a public image dataset. The experimental
results demonstrate that the proposed technique provides promising results
and is superior to the baseline techniques.

1. INTRODUCTION

In 2019, the emergence of the coronavirus disease (COVID-19) changed people's lives all over the
world, leading to a global crisis. In April 2021, according to Worldometer [1], COVID-19 was affecting 220
countries and territories. The total number of cases was 150,242,628, separated into 3 groups: i) 3,164,170
deaths, ii) 127,775,690 recovered cases, and iii) the remaining cases in hospital. Many researchers have
proposed methods for predicting COVID-19 cases [2]. Tuli et al. [3] predicted the growth and trend of
COVID-19 using machine learning (ML) and a robust Weibull model based on iterative weighting.
Moreover, according to the World Health Organization (WHO) report [4], the preliminary symptoms of a
person who has COVID-19 include fever, cough, and breathing problems and difficulties. In some cases, the
infection can result in pneumonia, severe acute respiratory syndrome, kidney failure, and even death. Unlike
normal influenza, COVID-19 can cause lasting lung damage that takes hold in both lungs [5]. Recovery
takes time, possibly three months to a year. Ritter et al. [6] proposed a simple statistical model for
predicting the level of intensive care load in exponential phases of the disease. The model predicted an
intensive care unit (ICU) rate of 5-18%, with an average of 12 days, depending on the area.
The long periods of ICU use required special care and treatment from the medical team such as investing,
disease examination and diagnosis. The main effect in humans appears in the lungs and is visible in
X-radiation (X-ray) images. Conventional analysis of X-ray images can be a heavy workload for doctors and
radiologists when there is a huge number of cases. Many researchers have tried to develop techniques to
help with this problem. Pham et al. [7] presented a survey on artificial intelligence (AI) and big data for the
COVID-19 pandemic. The survey gave an overview of AI and big data and identified applications aimed at
combating COVID-19, for example, rapid drug strategy, drug discovery, computed tomography (CT)
image processing, X-ray reports, and classification using many techniques such as deep learning, case
history, and outbreak prediction. Recently, a number of machine learning and computer vision techniques
have been proposed to provide an automated process for classifying chest X-ray (CXR) images [8]-[19].
Deep learning is one of the favored techniques. Different deep learning architectures, such as convolutional
neural networks (CNNs), have been applied to the CXR classification problem [8], [10], [11], [14].
Deep learning can generally produce promising results in classifying CXR images, compared to the
conventional handcrafted feature techniques [10], [11]. Apart from single-model classification,
ensemble learning is one of the techniques that have been applied to the CXR classification of
COVID-19. The stack ensemble technique was introduced to improve the classification results [15]-[17].
However, the ensemble technique can be prone to interpretation issues and computational burden [18].
Zargari et al. improved the image quality by applying an image normalization technique that aims to
standardize the images before the classification is conducted [19]. They demonstrated a marginal
improvement over classification without the image improvement method.

This work proposes a technique for classifying X-ray images of the chest into three types, i.e. (i)
normal, (ii) COVID-19, and (iii) pneumonia. The classification is challenging because the differences
between COVID-19 and pneumonia images are subtle. In addition, uneven image brightness and poor contrast
can degrade the CXR images and make the classification more intractable. Therefore, in this work, the
images are standardized using a look and feel transfer method: the distribution of color intensity of each
image is mapped to that of a predetermined image template. Then, the contrast of the image is improved. For
the classification, this work applies a deep learning-based technique that combines two deep learning
networks, a convolutional neural network (CNN) and long short-term memory (LSTM), into a hybrid
architecture. A deep feature (an abstract feature) of the images is extracted by the CNN before the final
classification is performed by the LSTM, which determines the local context of the feature vectors. In
addition to the hybrid models, this work explores the validity of image pre-processing methods for
improving the quality of the images before the classification is performed.

The rest of the paper is organized as follows: Section 2 explains the proposed method of classifying
the X-ray images, including the details of the baseline methods. Section 3 presents the experiments
conducted to evaluate the proposed methods and the results obtained. Section 4, the last section, provides
a comprehensive discussion of the work before the conclusion is given.


2. METHOD 

The objective of this work is to classify chest X-ray (CXR) images into different radiograph
types, focusing on specific diseases and the normal chest, i.e., i) normal, ii) COVID-19, and
iii) pneumonia. This section describes the proposed method for the classification task in detail. It starts
with an overview of the work, depicted in Figure 1, and then proceeds through the different components of
the method, as follows:


[Figure 1 flowchart: Data collection -> Data pre-processing -> VGG-16, ResNet 50, Inception, proposed
method 1, proposed method 2 -> Performance analysis]

Figure 1. Overview of the work

2.1. Data collection

Our data were collected from a public dataset [20], [21]. The data consist of 3 classes: i) normal,
ii) COVID-19, and iii) pneumonia, examples of which are depicted in Figure 2. The data are summarized in
Table 1.


Figure 2. An example of different radiological types of the X-ray images used in this work: (a) normal,
(b) COVID-19, and (c) pneumonia


Table 1. The experimental data

                         Target (Class)
Set         Total   Normal   COVID-19   Pneumonia
Training     544     184       160         200
Validate     136      46        40          50
Test         150      50        50          50


2.2. Data pre-processing 
Each image is processed to improve its quality. This work applies two techniques to improve the
quality of the images and standardize the data.

- Image standardization: this process normalizes all images in the data to reduce the variation (light
  and luminance) introduced during image acquisition. This work applies an image transformation
  technique in which the look and feel of the images in the dataset are shifted to the look and feel of a
  predetermined image, the so-called template image ($T$) [22]. The standardized image ($I_s$) is
  obtained by:

  $I_s = \frac{\sigma_T}{\sigma_I} \times m + \mu_T$    (1)

  where,

  $m = I - \mu_I$    (2)

  $\mu_I$ and $\mu_T$ are the average intensities of the pixel values of an input image and the template
  image, and $\sigma_I$ and $\sigma_T$ are the variances of the pixel intensities of an input image and
  the template image, respectively. An example of the standardized images is shown in Figure 3.

- Contrast enhancement: the contrast-enhanced image ($I_c$) is obtained using contrast limited adaptive
  histogram equalization (CLAHE) [23]. The input images are standardized before their contrast is
  enhanced, as shown in Figure 4.
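
The standardization step in (1) and (2) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `standardize_to_template` and the synthetic test images are hypothetical, and the sketch assumes that the spread terms in (1) are used as standard deviations so that the output statistics match those of the template.

```python
import numpy as np


def standardize_to_template(image, template, eps=1e-8):
    """Shift the intensity statistics of `image` to those of `template`.

    Assumption: sigma in Eq. (1) is taken as the standard deviation.
    """
    image = image.astype(np.float64)
    template = template.astype(np.float64)

    mu_i, sigma_i = image.mean(), image.std()
    mu_t, sigma_t = template.mean(), template.std()

    m = image - mu_i                             # Eq. (2): remove the input mean
    return (sigma_t / (sigma_i + eps)) * m + mu_t  # Eq. (1): rescale and re-center


# Synthetic stand-ins for a dark, low-contrast CXR and a well-exposed template.
rng = np.random.default_rng(0)
dark_image = rng.normal(loc=60.0, scale=10.0, size=(64, 64))
template_image = rng.normal(loc=128.0, scale=40.0, size=(64, 64))

standardized = standardize_to_template(dark_image, template_image)
print(round(standardized.mean(), 2), round(template_image.mean(), 2))
```

In practice the result would be clipped to the valid intensity range before being stored as an 8-bit image, and the contrast enhancement step would then be applied to it, for example with OpenCV's `cv2.createCLAHE`.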


Figure 3. An example of image standardization using an image transformation technique. The three leftmost
columns show the original images and the three rightmost columns show the standardized images of the
different image classes


[Figure panels: Original Images | Standardized Images | Standardized Images + Contrast Enhanced]


Figure 4. An example of the improved-contrast images. The three leftmost columns show the original images
and the three rightmost columns show the contrast-enhanced images of the different image classes


2.3. Deep learning architectures 

In classifying the image data, this work applies deep learning techniques; this section therefore
explains the deep learning architectures that are implemented. Deep learning architectures are
formulated as follows:

$I_l = \sigma_l(W_l * I_{l-1} + b_l)$    (3)

where $1 \le l \le L$ denotes the layer index and $L$ is the predefined number of layers in the network.
$I_0$ is an input image, $W_l$ and $b_l$ are the set of parameters at layer $l$, $*$ denotes the
convolution operator, and $\sigma_l$ is a layer-specific function, which is non-linear in general. The
output of the last layer, $p_L$ (that is, $I_L$), is the input to a softmax function $f$, which yields a
probability value for each of the target classes. The classification is then learned by minimizing a loss
function with respect to the network weights ($W$) as in (4).

$\min_{W} \mathcal{L}(f(p_L), W)$    (4)

The loss function ($\mathcal{L}$) measures the difference between the prediction obtained by the
network ($W$) and the target of the images. This loss function can be implemented using different
techniques, for instance, square loss, logistic loss, exponential loss, and hinge loss. Finally, an
optimization technique is applied during training to obtain a network that generalizes.
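
As a small illustration of the last step described above, the sketch below turns the output of a final layer into class probabilities with a softmax and evaluates a loss against a target class. The logit values are made up for illustration, and the logistic (cross-entropy) loss is chosen here only as one of the loss options listed above; the text does not state which loss the experiments used.

```python
import math

CLASSES = ["normal", "COVID-19", "pneumonia"]


def softmax(logits):
    """Convert last-layer outputs into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Logistic (cross-entropy) loss for a single image with a known class."""
    return -math.log(probs[target_index])


# Hypothetical last-layer outputs for one image.
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)

loss_if_normal = cross_entropy(probs, CLASSES.index("normal"))
loss_if_pneumonia = cross_entropy(probs, CLASSES.index("pneumonia"))
print(probs, loss_if_normal, loss_if_pneumonia)
```

The loss is small when the network assigns a high probability to the true class and large otherwise, which is the quantity that training drives down with respect to the weights.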

- The VGG-16 architecture: The VGG-16 convolutional neural network model was proposed by Simonyan and
  Zisserman [24]. The size of the input to the first layer is fixed to 224x224 pixels. The image is passed
  through a stack of convolutional layers, which act as filters, with a stride of 1 pixel. Five
  max-pooling layers (with a stride of 2) are used for spatial pooling, performing down-sampling; each
  max-pooling layer operates on a 2x2-pixel window. Finally, there are three fully connected layers with
  4096, 4096, and 1000 channels, respectively.

- The ResNet 50 architecture: The residual network (ResNet) is a type of convolutional neural network that
  can be trained with more than 150 layers. ResNet was proposed by He et al. in 2015 [25]. The
  advantage of ResNet is its simplicity and practicality. It can be applied to many tasks such
  as detection, segmentation, and identification. The features extracted from the ResNet layers can
  represent class-specific properties, which can provide promising performance compared to features
  extracted from similar network architectures.

- The Inception architecture: Inception is a deep learning architecture that can aggregate multiple filter
  sizes. The network was proposed by Szegedy et al. in 2014 [26]. This architecture comprises
  convolutional layers of different sizes (1x1, 3x3, and 5x5) whose output filter banks are aggregated into
  a single output vector.
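
The down-sampling performed by the five max-pooling layers of VGG-16 can be checked with a short arithmetic sketch. It assumes that the convolutional layers between the pooling layers preserve the spatial size (stride 1 with padding), so only the pooling layers shrink the image; the function name is hypothetical.

```python
def spatial_sizes_after_pooling(input_size=224, num_pools=5, window=2, stride=2):
    """Return the spatial size before pooling and after each max-pooling layer."""
    sizes = [input_size]
    size = input_size
    for _ in range(num_pools):
        # Standard output-size formula for pooling without padding.
        size = (size - window) // stride + 1
        sizes.append(size)
    return sizes


sizes = spatial_sizes_after_pooling()
print(sizes)  # [224, 112, 56, 28, 14, 7]
```

Under this assumption, a 224x224 input is reduced to a 7x7 feature map before the fully connected layers.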


2.4. Hybrid method 

This section explains the proposed technique that is used to classify the image data. The proposed
architecture relies on two sub-networks that are connected in sequence, as depicted in Figure 5. A CNN is
implemented to extract the deep features present in the images. An LSTM is deployed as a part of the
network to carry out the classification. The final layer ($I_f$) of the CNN is decomposed into 1-D feature
vectors before they are fed to the LSTM [27]. An optimization is performed in the training process to
obtain a generalized model for the classification.

Figure 5. The overall architecture of the proposed method for classifying an input image into the different
image types, i.e. i) Normal, ii) COVID-19, and iii) Pneumonia
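
One possible way to decompose a final CNN feature map into 1-D feature vectors for an LSTM is sketched below. The text does not specify how the decomposition is done, so both the 7x7x512 feature-map shape and the choice of treating each spatial position as one time step are assumptions made purely for illustration.

```python
import numpy as np


def feature_map_to_sequence(feature_map):
    """Flatten an (H, W, C) feature map into a sequence of H*W vectors of length C.

    Each spatial position becomes one time step, read row by row.
    """
    height, width, channels = feature_map.shape
    return feature_map.reshape(height * width, channels)


# Hypothetical final CNN feature map for one image.
feature_map = np.zeros((7, 7, 512), dtype=np.float32)
sequence = feature_map_to_sequence(feature_map)
print(sequence.shape)  # (49, 512)
```

A sequence of this form, 49 time steps with 512 features each, is the kind of input a recurrent layer expects, with the last LSTM state passed to a three-way softmax classifier.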


3. EXPERIMENTS AND RESULTS 

Section 2 gives the details of the proposed technique. This section presents the experiments that
were conducted and the results obtained, in order to evaluate the performance of the proposed technique.
Three deep learning architectures are deployed, i.e. i) VGG-16, ii) ResNet50, and iii) Inception-Net. In
the training process, the default parameters of each model are used, and training is carried out on the
training data. The test data comprise 150 X-ray images from the 3 classes; see Section 2.1 for the
details. The results of the experiments are presented as confusion matrices in Tables 2-4.


Table 2. The performance of VGG16

                              Actual class
Predicted class    Normal   COVID-19   Pneumonia   Precision
Normal               35        12          9          0.63
COVID-19              9        30         10          0.61
Pneumonia             6         8         31          0.70
Recall              0.70      0.60       0.62


Table 3. The performance of ResNet50

                              Actual class
Predicted class    Normal   COVID-19   Pneumonia   Precision
Normal               37        11         14          0.60
COVID-19              8        28          5          0.68
Pneumonia             5        11         31          0.66
Recall              0.77      0.56       0.62


Table 4. The performance of Inception

                              Actual class
Predicted class    Normal   COVID-19   Pneumonia   Precision
Normal               36        10         10          0.64
COVID-19              5        32          6          0.74
Pneumonia             9         8         34          0.67
Recall              0.72      0.64       0.68


Tables 2-4 show the results of the baseline techniques in classifying the X-ray images. The
accuracy of the baseline techniques is around 65%, and their average recall is 66%.
ResNet50 provides the lowest recall on the COVID-19 image class, while it yields a promising recall for
the normal image class.

In the previous experiment, the baseline techniques for the classification were evaluated and their
results were obtained. Next, the proposed techniques were evaluated. In addition to the original data, in
this experiment all the images in the dataset were normalized and enhanced in terms of contrast, using the
technique described in Section 2.2. The results are presented in Tables 5 and 6.


Table 5. The performance of the proposed method without image improvement process

                              Actual class
Predicted class    Normal   COVID-19   Pneumonia   Precision
Normal               37         8          9          0.69
COVID-19              5        31          4          0.78
Pneumonia             8        11         37          0.66
Recall              0.74      0.62       0.74


Table 6. The performance of the proposed method with image improvement process

                              Actual class
Predicted class    Normal   COVID-19   Pneumonia   Precision
Normal               40        14          8          0.65
COVID-19              6        30          4          0.75
Pneumonia             4         6         38          0.79
Recall              0.80      0.60       0.76


The results in Table 5 are obtained from the proposed method. The accuracy of the proposed method
is around 71%, and the average recall is also 71%. The proposed technique with the image
improvement process gives the best recall (80%) for the normal class, which compares favorably with the
results obtained from the baselines. It can also be observed that both the COVID-19 and pneumonia images
are usually assigned to the normal image class when they are misclassified. The comparison of the baselines
and the proposed techniques is summarized in Table 7.

The summary of the experimental results in Table 7 demonstrates that the proposed
technique is superior to the baseline techniques on the original data (without the image improvement
process). In addition, it can be observed that the image improvement process can increase the accuracy of
the classification, resulting in an F1-score of 72%.


Table 7. The comparative performance of the 5 methods

Method                                  Accuracy   Recall   Precision    F1
VGG16                                     0.64      0.64      0.65      0.65
ResNet50                                  0.64      0.64      0.65      0.64
Inception                                 0.68      0.68      0.68      0.68
Proposed method                           0.70      0.70      0.71      0.70
Proposed method + image improvement       0.72      0.72      0.73      0.72
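
The way the summary figures follow from a confusion matrix can be sketched as below, using the counts of Table 6 (rows are predicted classes, columns are actual classes). The macro-averaging over the three classes is an assumption about how the summary values were aggregated; with it, the sketch reproduces the last row of Table 7.

```python
def classification_metrics(matrix):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    matrix[i][j] is the number of images predicted as class i
    whose actual class is j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))

    # Precision divides by the row total (all images predicted as class i).
    precision = [matrix[i][i] / sum(matrix[i]) for i in range(n)]
    # Recall divides by the column total (all images that actually are class i).
    recall = [matrix[i][i] / sum(matrix[r][i] for r in range(n)) for i in range(n)]
    f1 = [2 * p * r / (p + r) for p, r in zip(precision, recall)]

    return {
        "accuracy": correct / total,
        "precision": sum(precision) / n,
        "recall": sum(recall) / n,
        "f1": sum(f1) / n,
    }


# Table 6: proposed method with image improvement.
table_6 = [
    [40, 14, 8],   # predicted normal
    [6, 30, 4],    # predicted COVID-19
    [4, 6, 38],    # predicted pneumonia
]

metrics = classification_metrics(table_6)
print({name: round(value, 2) for name, value in metrics.items()})
# {'accuracy': 0.72, 'precision': 0.73, 'recall': 0.72, 'f1': 0.72}
```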


4. DISCUSSION AND CONCLUSION 

Chest X-ray images are radiological resources that can be used to determine the severity of
respiratory diseases such as COVID-19 and pneumonia. To increase the reproducibility of disease
diagnosis, this work proposes a classification technique that separates the X-ray images into different
types (COVID-19, pneumonia, and normal). Deep learning-based techniques are used to perform the
classification task. The baseline techniques, comprising VGG-16, ResNet50, and Inception, are
implemented first. Then, the proposed technique is constructed. The technique combines two deep neural
networks, a CNN and an LSTM. The CNN extracts abstract, discriminative features from the image, and the
LSTM carries out the classification, extracting the local context of the features generated by the CNN.

The results of the experiments conducted on a standard dataset show that the proposed technique
using the hybrid network architecture yields the best results. In addition, it can be observed that, for
all classification techniques, the recall of the normal class is marginally better than that of the other
classes. The X-ray images of the normal class do not contain large white areas in the lung region.
The image improvement process (both standardization and contrast improvement) can be key to helping
the classifiers differentiate between the image classes. However, for the COVID-19 and pneumonia
image classes, the majority of misclassifications occur when the images are assigned to the normal image
class. Therefore, there is only a subtle difference between the normal image class and the COVID-19 and
pneumonia image classes. In addition, the experimental results indicate that there is a certain
level of difference between the COVID-19 image class and the pneumonia image class.

Future work will investigate extensions of the proposed technique using the hybrid method. 2D input
data for the LSTM will also be implemented. In addition, ensemble techniques that combine different deep
learning architectures will be investigated.